Issue Info: 
  • Year: 

    2021
  • Volume: 

    51
  • Issue: 

    4
  • Pages: 

    443-454
Measures: 
  • Citations: 

    0
  • Views: 

    187
  • Downloads: 

    37
Abstract: 

Multi-label classification aims at assigning more than one label to each instance. Many real-world multi-label classification tasks are high dimensional, which reduces the performance of traditional classifiers. Feature selection is a common approach to tackle this issue by choosing prominent features. Multi-label feature selection is an NP-hard problem, and so far some swarm intelligence-based strategies have been proposed to find a near-optimal solution within a reasonable time. In this paper, a hybrid intelligence algorithm based on binary particle swarm optimization and a novel local search strategy is proposed to select a set of prominent features. To this aim, and to increase the convergence speed, features are divided into two categories based on the extension rate and the relationship between the output and the local search strategy. The first group contains features that are more similar to the class and less similar to other features, and the second contains redundant and less relevant features. Accordingly, a local operator is added to the particle swarm optimization algorithm to reduce redundant features and keep relevant ones in each solution. This operator enhances the convergence speed of the proposed algorithm compared to other algorithms presented in this field. Evaluation of the proposed solution, together with the statistical test, shows that the proposed approach improves different multi-label classification criteria and outperforms other methods in most cases. In cases where achieving higher accuracy is more important than running time, this method is the more appropriate choice.
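The following is a minimal sketch of the two ingredients the abstract names: a binary particle swarm update and a local operator that prunes redundant features while keeping relevant ones. It is not the authors' implementation. The sigmoid transfer function is the standard choice for binary PSO, while the function names, the parameters (`inertia`, `c1`, `c2`, `v_max`, `rate`) and the assumption that the two feature groups are already available as index lists are hypothetical, since the abstract does not say how the groups are computed.

```python
import math
import random


def sigmoid(v):
    """Map a velocity to a probability of setting the bit to 1."""
    return 1.0 / (1.0 + math.exp(-v))


def update_particle(position, velocity, personal_best, global_best, rng,
                    inertia=0.7, c1=1.5, c2=1.5, v_max=4.0):
    """One standard binary PSO step; each position bit marks a selected feature."""
    new_position, new_velocity = [], []
    for x, v, p, g in zip(position, velocity, personal_best, global_best):
        v = inertia * v + c1 * rng.random() * (p - x) + c2 * rng.random() * (g - x)
        v = max(-v_max, min(v_max, v))  # clamp so the sigmoid does not saturate
        new_velocity.append(v)
        new_position.append(1 if rng.random() < sigmoid(v) else 0)
    return new_position, new_velocity


def local_operator(position, relevant, redundant, rng, rate=0.5):
    """Assumed local search: drop selected redundant features and
    add missing relevant ones, each with probability `rate`."""
    repaired = list(position)
    for j in redundant:
        if repaired[j] == 1 and rng.random() < rate:
            repaired[j] = 0
    for j in relevant:
        if repaired[j] == 0 and rng.random() < rate:
            repaired[j] = 1
    return repaired
```

In a full loop, `local_operator` would be applied to each particle after `update_particle` and before the fitness evaluation, so that every candidate subset is nudged toward the relevant group.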

Issue Info: 
  • Year: 

    2025
  • Volume: 

    13
  • Issue: 

    2
  • Pages: 

    341-356
Measures: 
  • Citations: 

    0
  • Views: 

    8
  • Downloads: 

    0
Abstract: 

Feature selection is an important step in data preprocessing, which helps reduce the dimensionality of data and simplify the models. This process not only reduces the computational complexity of models, but also improves their accuracy by eliminating irrelevant features and noise. The three most widely used approaches to feature selection are filter, wrapper and embedded methods. In this paper, we first review some support vector machine based Mixed-Integer Linear Programming (MILP) models and the Supervised Infinite Feature Selection (Inf-FS$_s$) method. Then, we propose three hybrid approaches based on them. The first approach solves the relaxed linear model of the underlying MILP model and then solves the MILP model only for those features with nonzero weights, namely a smaller MILP. In the second approach, the Inf-FS$_s$ method is first applied to rank the features. Then, depending on the feature costs, the approach either chooses the top features from the ranking until the budget parameter is reached or solves a knapsack problem to select cost-effective features. The third approach applies the first approach to the top $20\%$ of features ranked by the Inf-FS$_s$ method. To evaluate the performance of the proposed approaches, experiments are conducted on four high-dimensional benchmark datasets with fixed and random feature costs. Results demonstrate that any of the proposed approaches can significantly reduce the running time of MILP models while achieving accuracies comparable to those of the original MILP models.
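The budgeted selection step of the second approach can be illustrated with a short sketch. It assumes a feature ranking and per-feature scores are already available (for instance from Inf-FS$_s$, which is not implemented here), that costs and the budget are non-negative integers, and that the knapsack maximizes the sum of ranking scores. The abstract does not state the knapsack objective, so that choice and both function names are hypothetical.

```python
def select_top_until_budget(ranked, costs, budget):
    """Take features in ranked order until the next one would exceed the budget."""
    chosen, spent = [], 0
    for j in ranked:
        if spent + costs[j] > budget:
            break
        chosen.append(j)
        spent += costs[j]
    return chosen


def select_by_knapsack(scores, costs, budget):
    """0/1 knapsack by dynamic programming: maximize total score within the budget."""
    n = len(scores)
    # best[i][b] = best total score using the first i features with cost at most b
    best = [[0.0] * (budget + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        c, s = costs[i - 1], scores[i - 1]
        for b in range(budget + 1):
            best[i][b] = best[i - 1][b]
            if c <= b and best[i - 1][b - c] + s > best[i][b]:
                best[i][b] = best[i - 1][b - c] + s
    # Walk back through the table to recover which features were taken.
    chosen, b = [], budget
    for i in range(n, 0, -1):
        if best[i][b] != best[i - 1][b]:
            chosen.append(i - 1)
            b -= costs[i - 1]
    return sorted(chosen)
```

With scores `[0.9, 0.8, 0.5, 0.4]`, costs `[5, 2, 2, 1]` and a budget of 5, the ranked cut-off keeps only feature 0, whereas the knapsack picks features 1, 2 and 3. This shows why the two rules can differ once costs are not uniform.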

Issue Info: 
  • Year: 

    2023
  • Volume: 

    1
  • Issue: 

    2
  • Pages: 

    16-28
Measures: 
  • Citations: 

    0
  • Views: 

    35
  • Downloads: 

    0
Abstract: 

Feature selection methods are known to be effective in improving the learning process. The purpose of a feature selection method is to identify relevant features and remove irrelevant ones in order to obtain a suitable subset of features in which the redundancy between the selected features is minimized. In multi-label data, correlation between features can increase the amount of redundancy in the feature set. Redundancy between features, along with the high dimensionality of multi-label data, can increase the computational cost, decrease the accuracy and ultimately increase the probability of errors in the prediction and classification of multi-label data. In this article, with the aim of minimizing the redundancy of features, a multi-label feature selection algorithm is proposed based on the least squares regression model and sparse regularization. Finally, using a number of well-known multi-label datasets, the efficiency of the proposed method is verified and the results are compared with some common multi-label feature selection methods.

Author(s): 

Journal: 

INFORMATION SCIENCES

Issue Info: 
  • Year: 

    2020
  • Volume: 

    531
  • Issue: 

    -
  • Pages: 

    13-30
Measures: 
  • Citations: 

    1
  • Views: 

    57
  • Downloads: 

    0
Keywords: 
Abstract: 

Author(s): 

ZHU M. | SONG J.

Issue Info: 
  • Year: 

    2013
  • Volume: 

    17
  • Issue: 

    -
  • Pages: 

    1047-1054
Measures: 
  • Citations: 

    1
  • Views: 

    202
  • Downloads: 

    0
Keywords: 
Abstract: 

Issue Info: 
  • Year: 

    2024
  • Volume: 

    15
  • Issue: 

    9
  • Pages: 

    271-287
Measures: 
  • Citations: 

    0
  • Views: 

    7
  • Downloads: 

    0
Abstract: 

High-dimensional data platforms, despite the opportunities they offer, pose many computational challenges. One problem with high-dimensional data is that, most of the time, not all characteristics of the data are important for finding the knowledge hidden in them. Such features can have a negative effect on the performance of the classification system. An important technique to overcome this problem is feature selection. During the feature selection process, a subset of the primary features is selected by removing irrelevant and redundant features. In this article, a hierarchical algorithm based on the coverage solution is presented, which selects effective features by using relationships between features and clustering techniques. This new method, named GCPSO, is based on the optimization algorithm and selects the appropriate features by using a feature clustering technique. The feature clustering method presented in this article differs from previous algorithms: instead of using traditional clustering models, the final clusters are formed by using the graph structure of the features and the relationships between them. The UCI database has been used to evaluate the proposed method due to its extensive characteristics. The efficiency of the proposed model has also been compared with feature selection methods based on the coverage solution that use evolutionary algorithms in the feature selection process. The results indicate that the proposed method performs well in terms of choosing the optimal subset and classification accuracy on all datasets and in comparison with the other methods.

Issue Info: 
  • Year: 

    2019
  • Volume: 

    16
  • Issue: 

    1 (39)
  • Pages: 

    21-40
Measures: 
  • Citations: 

    0
  • Views: 

    805
  • Downloads: 

    0
Abstract: 

Imbalanced data can be seen in various areas such as text classification, credit card fraud detection, risk management, web page classification, image classification, medical diagnosis/monitoring, and biological data analysis. Classification algorithms tend to favor the majority class and might even treat minority-class data as outliers. Text data is one of the areas where imbalance occurs. The amount of text information is rapidly increasing in the form of books, reports, and papers. Fast and precise processing of this amount of information requires efficient automatic methods, and text classification is one of the key processing tools. One of the problems with text classification is the high dimensionality of the data, which makes learning algorithms impractical. The problem becomes larger when the text data are also imbalanced, because an imbalanced data distribution reduces the performance of classifiers. The various solutions proposed for this problem are divided into several categories, of which sampling-based methods and algorithm-based methods are among the most important. Feature selection is also considered one of the solutions to the imbalance problem. In this research, a new method of one-way feature selection is presented for imbalanced data classification. The proposed method calculates the indicator rate of a feature using the feature distribution. In the proposed method, the documents are divided into different parts based on whether they contain a feature or not, and on whether they belong to the positive class or not. Based on this division, a new method is suggested for feature selection, which uses the following observations. If a feature occurs in most positive-class documents, it is a good indicator for the positive class and should therefore have a high score for this class.
This can be expressed as the proportion of positive-class documents that contain the feature. Besides, if most of the documents containing the feature belong to the positive class, the feature should receive a high score as the class indicator. This can be expressed as the proportion of documents containing the feature that belong to the positive class. If most of the documents that do not contain a feature are not in the positive class, the feature should receive a high score as the representative of this class. Moreover, if most of the documents that are not in the positive class do not contain the feature, the feature should receive a high score. Using the proposed method, the score of each feature is computed. Finally, the features are sorted in descending order of score, and the required number of features is selected from the beginning of the list. To evaluate the performance of the proposed method, different feature selection methods such as Gini, DFS, MI and FAST were implemented for comparison, and the C4.5 decision tree and Naive Bayes classifiers were used. The results of tests on the Reuters-21578 and WebKB datasets in terms of the Micro F, Macro F and G-mean criteria show that the proposed method considerably improves the efficiency of the classifiers compared to the other methods.
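The four proportions described above can be computed from a simple contingency count, as the sketch below illustrates. The abstract does not say how the four proportions are combined into one score, so summing them is only an assumption made here for illustration. The function names and the representation of documents as sets of terms are likewise hypothetical.

```python
def feature_proportions(docs, labels, term):
    """Return the four proportions for one term.

    docs   : list of sets of terms, one set per document
    labels : list of booleans, True for positive-class documents
    """
    tp = fp = fn = tn = 0
    for doc, positive in zip(docs, labels):
        has_term = term in doc
        if has_term and positive:
            tp += 1          # positive document containing the term
        elif has_term:
            fp += 1          # non-positive document containing the term
        elif positive:
            fn += 1          # positive document without the term
        else:
            tn += 1          # non-positive document without the term

    def ratio(num, den):
        return num / den if den else 0.0

    return (
        ratio(tp, tp + fn),  # share of positive documents that contain the term
        ratio(tp, tp + fp),  # share of documents with the term that are positive
        ratio(tn, tn + fn),  # share of documents without the term that are not positive
        ratio(tn, tn + fp),  # share of non-positive documents that lack the term
    )


def rank_terms(docs, labels, vocabulary, k):
    """Score each term (assumed: sum of the four proportions) and keep the top k."""
    scored = [(sum(feature_proportions(docs, labels, t)), t) for t in vocabulary]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # descending score, ties by name
    return [t for _, t in scored[:k]]
```

A term that appears in every positive document and in no other document gets all four proportions equal to 1, while a term spread evenly over both classes gets 0.5 for each.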

Issue Info: 
  • Year: 

    2023
  • Volume: 

    11
  • Issue: 

    1
  • Pages: 

    29-37
Measures: 
  • Citations: 

    0
  • Views: 

    27
  • Downloads: 

    2
Abstract: 

High dimensionality is the biggest problem when working with large datasets. Feature selection is a procedure for reducing the dimensionality of datasets by removing redundant and irrelevant features; the most effective features remain in the dataset, which increases the algorithms' performance. In this paper, a novel procedure for feature selection is presented that uses a binary teaching-learning-based optimization algorithm with mutation (BMTLBO). The TLBO algorithm is one of the most efficient and practical optimization techniques. Although this algorithm converges quickly and benefits from its exploration capability, it may become trapped in a local optimum, so we try to establish a balance between exploration and exploitation. The proposed method has two parts. First, we use the binary version of the TLBO algorithm for feature selection and add a mutation operator to provide a strong local search capability (BMTLBO). Second, we use a modified TLBO algorithm with a self-learning phase (SLTLBO) to train a neural network, using the classification problem to evaluate the performance of the method. We tested the proposed method on 14 datasets in terms of classification accuracy and the number of selected features. The results showed that BMTLBO outperformed the standard TLBO algorithm and demonstrated the potency of the proposed method. The results are very promising and close to optimal.
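The abstract does not describe the mutation operator itself. A common choice for binary feature masks is bit-flip mutation, sketched below under that assumption. The function name, the `rate` parameter and the repair step that keeps at least one feature selected are all hypothetical and are not taken from the paper.

```python
import random


def bit_flip_mutation(mask, rate, rng):
    """Flip each bit of a binary feature mask independently with probability `rate`."""
    mutated = [1 - bit if rng.random() < rate else bit for bit in mask]
    # Assumed repair: an empty feature subset cannot be evaluated,
    # so switch one randomly chosen feature back on.
    if not any(mutated):
        mutated[rng.randrange(len(mutated))] = 1
    return mutated
```

A small `rate` perturbs a learner's mask only slightly, which is one simple way to add local search around solutions produced by the teaching and learning phases.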

Author(s): 

YU M. | YAN G. | ZHU Q.W.

Issue Info: 
  • Year: 

    2006
  • Volume: 

    -
  • Issue: 

    5
  • Pages: 

    3233-3236
Measures: 
  • Citations: 

    1
  • Views: 

    114
  • Downloads: 

    0
Keywords: 
Abstract: 

Issue Info: 
  • Year: 

    2009
  • Volume: 

    7
  • Issue: 

    2
  • Pages: 

    154-161
Measures: 
  • Citations: 

    0
  • Views: 

    1005
  • Downloads: 

    0
Abstract: 

Pathological changes within an organ can be reflected as proteomic patterns in blood. Mass spectrometry has been used as a powerful tool to generate proteomic patterns from serum. The resulting profiles can be viewed as high-dimensional, correlated data in which the features of scientific interest are the peaks. Due to this complexity, an appropriate analysis method such as the wavelet transform is needed. In this study, we propose a pseudo-covariance wavelet-based feature extraction method for dimension reduction and de-correlation of mass spectra data. Our algorithm was applied to ovarian cancer datasets obtained from the National Cancer Institute of the USA. The proposed algorithm was used to extract a set of proteins as potential biomarkers in each dataset from the reconstructed mass spectra. The selected biomarkers were able to distinguish ovarian cancer patients from non-cancer subjects with high accuracy using standard diagnostic criteria. Using different classification algorithms, our approach yielded an accuracy of 98%, a specificity of 97%, and a sensitivity of 98%.
